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ABSTRACT 

We describe our methodology for comparing the WMAP measurements of the cosmic 
microwave background (CMB) and other complementary data sets to theoretical models. 
The unprecedented quality of the WMAP data, and the tight constraints on cosmological 
parameters that are derived, require a rigorous analysis so that the approximations made 
in the modeling do not lead to significant biases. 

We describe our use of the likelihood function to characterize the statistical proper- 
ties of the microwave background sky. We outline the use of the Monte Carlo Markov 
Chains to explore the likelihood of the data given a model to determine the best fit 
cosmological parameters and their uncertainties. 

We add to the WMAP data the £ > 700 CBI and ACBAR measurements of the 
CMB, the galaxy power spectrum at z ~ obtained from the 2dF galaxy redshift survey 
(2dFGRS), and the matter power spectrum at z ~ 3 as measured with the Lyman a 
forest. These last two data sets complement the CMB measurements by probing the 
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matter power spectrum of the nearby universe. Combining CMB and 2dFGRS requires 
that we include in our analysis a model for galaxy bias, redshift distortions, and the non- 
linear growth of structure. We show how the statistical and systematic uncertainties in 
the model and the data are propagated through the full analysis. 



1. INTRODUCTION 

CMB experiments are powerful cosmological probes because the early universe is particularly 
simple and because the fluctuations over angular scales 9 > 0°.2 are described by linear theory 
(Peebles & Yu 1970; Bond & Efstathiou 1984; Zaldarriaga & Seljak 2000). Exploiting this simplicity 
to obtain precise constraints on cosmological parameters requires that we accurately characterize 
the performance of the instrument (Jarosik et al. 2003b; Page et al. 2003a; Barnes et al. 2003; 
Hinshaw et al. 2003a), the properties of the foregrounds (Bennett et al. 2003c), and the statistical 
properties of the microwave sky. 

The primary goal of this paper is to present our approach to extracting the cosmological 
parameters from the temperature-temperature angular power spectrum (TT) and the temperature- 
polarization angular cross-power spectrum (TE). In companion papers, we present the TT (Hinshaw 
et al. 2003b) and TE (Kogut et al. 2003) angular power spectra and show that the CMB fluctuations 
may be treated as Gaussian (Komatsu et al. 2003). 

Our basic approach is to constrain cosmological parameters with a likelihood analysis first 
of the WMAP TT and TE spectra alone, then jointly with other CMB angular power spectrum 
determinations at higher angular resolution, and finally of all CMB power spectra data jointly with 
the power spectrum of the large-scale structure (LSS). In §2 we describe the use of the likelihood 
function for the analysis of microwave background data. This builds on the Hinshaw et al. (2003b) 
methodology for determining the TT spectrum and its curvature matrix, and Kogut et al. (2003) 
who describe our methodology for determining the TE spectrum. In §3 we describe our use of 
Markov Chains Monte Carlo (MCMC) techniques to evaluate the likelihood function of model 
parameters. While WMAP^s measurements are a powerful probe of cosmology, we can significantly 
enhance their scientific value by combining the WMAP data with other astronomical data sets. 
This paper also presents our approach for including external CMB data sets (§4), LSS data (§5) 
and Lyman a forest data (§6). When including external data sets the reader should keep in mind 
that the physics and the instrumental effects involved in the interpretation of these external data 
sets (especially 2dFGRS and Lyman a ) are much more complicated and less well understood than 
for WMAP data. Nevertheless we aim to match the rigorous treatment of uncertainties in the 
WMAP angular power spectrum with the inclusion of known statistical and systematic effects (of 
the data and of the theory), in the complementary data sets. 
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2. LIKELIHOOD ANALYSIS OF WMAP ANGULAR POWER SPECTRA 

The first goal of our analysis program is to determine the values and confidence levels of the 
cosmological parameters that best describe the WMAP data for a given cosmological model. We 
also wish to discriminate between different classes of cosmological models, in other words to assess 
whether a cosmological model is an acceptable fit to WMAP data. 

The ultimate goal of the likelihood analysis is to find a set of parameters that give an estimate 
of {Ci), the ensemble average of which the realization on our sky-*^^ is Cf^^ . The likelihood function, 

£,{Ce\Cf^{d)), yields the probability of the data given a model and its parameters (a). In our 
notation Ci denotes our best estimator of Cf^^ (Hinshaw et al. 2003b) and Cf^ is the theoretical 
prediction for angular power spectrum. From Bayes' Theorem, we can split the expression for the 
probability of a model given the data as: 

V{a\Ce) = jr{Ce\Cl\a))V{a), (1) 

where P(a) describes our priors on cosmological parameters and we have neglected a normalization 
factor that does not depend on the parameters . Once the choice of the priors are specified, our 
estimator of (Q) is given by Cf^ evaluated at the maximum of P{d\Ci). 



2.1. Likelihood Function 

One of the generic predictions of inflationary models is that fluctuations in the gravitational 
potential have Gaussian random phases. Since the physics that governs the evolution of the tem- 
perature and metric fluctuations is linear, the temperature fluctuations are also Gaussian. If we 
ignore the effects of non-linear physics at 2 < 10 and the effect of foregrounds, then all of the 
cosmological information in the microwave sky is encoded in the themperature and polarization 
power spectra. The leading-order low-redshift astrophysical effect is expected to be gravitational 
lensing of the CMB by foreground structures. We ignore this effect here as it generates a < 1% 
covariance in the TT angular power spectrum on WMAP angular scales (Hu 2001) (see Spergel et 
al. 2003, §3). 

There are several expected sources of non-cosmological signal and of non-Gaussianity in the 
microwave sky. The most significant sources on the full sky are Galactic foreground emission, 
radio sources, and galaxy clusters. Bennett et al. (2003c) show that these contributions are greatly 
reduced if we restrict our analysis to a cut sky that masks bright sources and regions of bright 
Galactic emission. The residual contribution of these foregrounds is further reduced by the use 
external templates to subtract foreground emission from the Q, V and W band maps. Komatsu 
et al. (2003) find no evidence for deviations from Gaussianity on this template-cleaned cut sky. 



^^Throughout this paper we use the convention that Ce = £{i + l)Ce/{2n). 
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While the sky cut greatly reduces foreground emission, it has the unfortunate effect of coupling 
multipole modes on the sky so that the power spectrum covariance matrix is no longer diagonal. 
The goal of this section is to include this covariance in the likelihood function. 

The likelihood function for the temperature fluctuations observed by a noiseless experiment 
with full sky coverage has the form: 



C{T\C, 



exp 



oc 



(rs-ir)/2 



VdetS 



(2) 



where T denotes our temperature map; and Sij = ^^(2^ + l)Cf^P£{hi ■ hj)/{47r), where the Pc arc 
the Legendre polynomials and hi is the pixel position on the map. If we expand the temperature 
map in spherical harmonics: T{h) = "^^j^aimYerm then the likelihood function for each aim has a 
simple form: 

£m y 

Since we assume that the universe is isotropic, the likelihood function is independent of m. Thus, 
we can sum over m and rewrite the likelihood function as 



-2 In/: = ^(2£+ 1) 



In 



nth 



+ -1 



(4) 



up to an irrelevant additive constant. Here, for a full sky, noiseless experiment we have identified 
X^rn |o^^mP/(2^ + 1) with Q. Note that the likelihood function depends only on the angular power 
spectrum. In this limit, the angular power spectrum encodes all of the cosmological information in 
the CMB. 

Characteristics of the instrument are also included in the likelihood analysis. Jarosik et al. 
(2003a) show that the detector noise is Gaussian (see their Figure 6 and §3.4); consequently the 
pixel noise in the sky map is also Gaussian (Hinshaw et al. 2003b). The resolution of WMAP is 
quantified with a window function, wi (Page et al. 2003b). Thus, the likelihood function for our 
CMB map has the same form as equation (2), but with S replaced by C = S + N where N is the 
nearly diagonal noise correlation matrix-''^ and Sij = YleC^^ ^ 'i^)Cf^wePe{hi ■ hj)/{4-ir). 

If foreground removal did not require a sky cut and the noise were uniform and purely diagonal, 
then the likelihood function for the WMAP experiment would have the form (Bond et al. 2000): 



-21n£ = ^(2£ + 1) 



In 



+ 



1 



(5) 



^^1// noise makes a non-random phase contribution to the detector noise and leads to ofF-diagonal terms in the 
noise matrix. By making the noise No a function of I (denoted by Nt) we include this effect to leading order (Hinshaw 
et al. 2003) 
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whcre the effective bias J\f^ is related to the noise bias A^^ as M(, = Ni/w^i{i + l)/(27r) and Q = 
l{l + 1) / {2tt) "^^im k'^mP/(2^+ Note that J\f^ and Cf^ appear together in equation (5) because 
the noise and cosmological fluctuations have the same statistical properties, they both are Gaussian 
random fields. 

Because of the foreground sky-cut, different multipoles are correlated and only a fraction of the 
sky, /sky, is used in the analysis. In this case, it becomes computationally prohibitive to compute the 
exact form of the likelihood function. There are several different approximations used in the CMB 
literature for the likelihood function. At large ^, equation (5) is often approximated as Gaussian: 

ln/:Gauss oc Y^^Cf - Ci)Qa'{Cf - Ce) , (6) 

where Q«', the curvature matrix, is the inverse of the power spectrum covariance matrix. 

The power spectrum covariance encodes the uncertainties in the power spectrum due to cosmic 
variance, detector noise, point sources, the sky cut, and systematic errors. Hinshaw et al. (2003a) 
and §(2.2) describe the various terms that enter into the power spectrum covariance matrix. 

Since the likelihood function for the power spectrum is slightly non-Gaussian, equation (6) is 
a systematically biased estimator. Bond et al. (2000) suggest using a lognormal distribution, >Cln 
(Bond et al. 2000; Sievers et al. 2002): 

-21n/:LN = Y.^zf - Zi)Qu'{zf - ze), (7) 

where zf^ = ln(C^*' + M), ze = ln{Ci + Me) and Qw is the local transformation of the curvature 
matrix Q to the lognormal variables Zi, 

Qw = (Ce + Mi)Qee (Ce + Me). (8) 

We find that, for the WMAP data, both equations (6) and (7) are biased estimators. We use an 
alternative approximation of the likelihood function for the C^s (equation 11) motivated by the 
following argument. 

We can expand the exact expression for the likelihood (equation 4) around its maximum by 
writing Ce = (1 + e). Then, for a single multipole £, 

-21n£,= (2£ + l)[e-ln(l + 6)]c.(2^+l)(|-| + C>(6^)) . (9) 



We note that the Gaussian likelihood approximation is equivalent to the above expression truncated 

-•th 
■■I 



at e^: -21n/:Gauss/ oc {21 + 1)/2[(C, - Cf)ICff ~ (2£ + l)eV2. 



The Bond et al. (1998) expression for the lognormal likelihood for the equal variance approxi- 
mation is 

(2£+i: 
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Thus our approximation of likelihood function is given by the form, 

1 2 

In-C = - In/Icauss + 2 Iii-Cln , (H) 
where jC'i^^ has the form of equation (7) apart from Q^' that is not given by equation (8) but by 

Qee' = {Cf'+AfMi^iCf" +Afi,). (12) 

We tested this form of the likelihood by making 100,000 full sky realizations of the TT angular 
power spectrum Cf^. For each realization, the maximum likelihood amplitude of fluctuations in 
the underlying model was found and the mean value was computed. Since we kept all other model 
parameters fixed, this one dimensional maximization was computationally trivial. The Gaussian 
approximation (equation 6) was found to systematically overestimate the amplitude of the fluctu- 
ations by ~ 0.8%, while the lognormal approximation underestimates it by ~ 0.2%. Equation (11) 
was found to be accurate to better than 0.1%. 



2.2. Curvature Matrix 

We obtain the curvature matrix in a form that can be used in the likelihood analysis from 
the power spectrum covariance matrix for Q computed in Hinshaw et al. (2003a). The matrix is 
composed of several terms of the following form: 

T,^/ = V DeD^f [5ffi, - rw) + e^^', (13) 

where e^^i is the coupling introduced by the beam uncertainties and point sources subtraction 
{ew = if £ = ^'), (5^ denotes the Kronecker delta function, and denotes the diagonal terms, 

_ JC«'+M,f- ,,,, 

The quantity rw encodes the mode coupling due to the sky-cut and is the dominant off-diagonal 
term (it is set to be if £ = i"). The mode-coupling coefficient, r^i, is most easily defined in terms 
of the curvature matrix, Qu' = + rw / \J D^D^i (see Hinshaw et al. 2003^^). 

The sky cut has two significant effects on the power spectrum covariance matrix. Because 
less data is used, the covariance matrix is increased by a factor of /sky An additional factor of 
/sky arises from the coupling to nearby (. modes. The additional term does not lead to a loss of 
information as nearby I modes are slightly anti-correlated. 

Hinshaw et al. (2003b) describe the beam uncertainty and point source terms included in A/^ 
and fj,^!. The beam and calibration uncertainties depend on the realization of the angular power 

^^In this equation we have set to zero the beam and point sources uncertainties. This is because the coupUng 
coefficient is computed for an ideal cut sky. 
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spectrum on the sky Cf^^ , not on the theoretical angular power spectrum Cf^, thus they should not 
change as, in exploring the likelihood surface, we change Cf^ in the expression for D^. This differs 
from other approaches (e.g.. Bridle ct al. (2002)). Rescaling all the contributions to the off diagonal 
terms in the covariance matrix with Cf^ is not correct and leads to a 2% bias in our estimator of 
(Q) which propagates, for example, into a ^ 2/^ error on the matter density parameter or 
~ 2% error on the spectral slope Ug- 

We find the curvature matrix by inverting equation (13) 

Q,, = Df5^, - + , (15) 

where we have assumed that the off-diagonal terms are small. For cosmological models that have 
Cf^ very different from the best fit Ce, equation 15 does not yield the inverse of (13): in these cases 
the inversion of needs to be computed explicitely. 

We do not propagate the WMAP 0.5% calibration uncertainty in the covariance matrix as 
this uncertainty does not affect cosmological parameters determinations. This systematic only 
affects the power spectrum amplitude constraint at the 0.5% level, while the statistical error on 
this quantity is ~ 10%. 



2.2.1. Calibration with Monte Carlo Simulations 

The angular power spectrum is computed using three different weightings: uniform weighting 
in the signal-dominated regime (i < 200), an intermediate weighting scheme for 200 < £ < 450, and 
Nobs weighting (for the noise-dominated regime 450 < i < 900 (Hinshaw et al. 2003b)). Uniform 
weighting is a minimum variance weighting in the signal-dominated regime and Nobs weighting 
is a minimum variance in the noise dominated regime. However, in the intermediate regime the 
weighting schemes arc not necessarily optimal and the analytic expression for the covariance matrix 
might thus underestimate the errors. To ensure that we have the appropriate errors, we calibrate 
the covariance matrix from 100,000 Monte Carlo realizations of the sky with the WMAP noise level, 
symmetrized beams and the Kp2 sky cut. A good approximation of the curvature matrix can be 
obtained by using equations (13)-(15), but substituting A/"^ and /sky with J\f^^ and f^^^ calibrated 
from the Monte Carlo simulations, as shown in Figures 1 and 2. 

We find that for £ < 200 the weighting scheme is nearly optimal. The power spectrum covari- 
ance matrix (13) gives a correct estimate of the error bars, thus we do not need to calibrate Me 
or /sky We have computed an effective reduced chi-squared^^ Xeff/^ — — 21n£/zv where u is the 
number of degrees of freedom. The effective reduced chi-squared from the Monte-Carlo simulations 
in this £ range is consistent with unity. 



This is not exactly the reduced chi-squared because the likelihood is non-gaussian especially at low i. 
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In the intermediate regime our ansatz power spectrum covariance matrix (equation (13)) 

slightly underestimates the errors. This can be corrected by computing the covariance matrix 
for an effective fraction of the sky /^j^ as shown in Figure 1. The jagged line is the ratio obtained 
from the Monte Carlo simulations, the smooth curve shows the fit to f^^^ we adopt, 

feS 

= 0.813 + 0.001914£ - 7.405 X 10-^£^ + 8.65 X 10-^£^ (for 200 <£< 450). (16) 

/sky 

For i > 450, in the noise-dominated regime, the weighting is asymptotically optimal for / — > 
oo. However, since we are using a smaller fraction of the sky, we need again the correct the /sky 
factor. This numerical factor describes the reduction in effective sky coverage due to weighting 
the well observed ecliptic poles more heavily than the ecliptic plane (see Figure 3 of Bennett et al. 

(2003b)). We fit this factor to the numerical simulations of the TT spectrum covariance matrix. 
Kogut et al. (2003) notes that this same factor is also a good fit to the Monte Carlo simulations 
of the TE spectrum covariance matrix. For the noise-dominated regime, we define an effective sky 

fraction f^^^ = /sky/1.14 and an effective noise given by Mf = ^J (^fg^^ ^ (2£ + l)/2 - C 

which can be obtained from the noise bias of the maps A/^ by a noise correction factor J\f^^ /M^- 
This is shown in Figure 2 where the smooth curve is the fit we adopt to this correction factor, 

^/-efr 

= 1.046 -0.0002346(£- 450) + 3.204 X 10-'^(£- 450)^ for £ > 450 . (17) 

This calibration of the covariance matrix from the Monte Carlo simulations allows us to use 
the effective reduced chi-squared as a tool to assess goodness of fit. It can also be used to determine 
the relative likelihood of different models (e.g., Peiris et al. 2003). 



Sim 



2.3. Likelihood for the TE angular power spectrum 

Since the TE signal is noise-dominated, we adopt a Gaussian likelihood, where the curvature 
matrix is given by 

Qlf = {DD-Xe + rl?/4^r^ . (18) 

The expression for DJ^ is given by equation (10) of Kogut et al. (2003), and the coupling coefficient 
due to the sky cut, rj^ , is obtained from 100,000 Monte Carlo realizations of the sky with WMAP 
mask and noise level. The TE spectrum is computed with noise inverse weighting; in this regime 
depends only on the difference /S.^ = i — I' and is set to be at separations > 15. We use all 
multipoles 2 < £ < 450, as comparison with the Monte Carlo realizations shows that in this regime 
equation (18) correctly estimates the TE uncertainties. We have also verified on the simulations 
that the Gaussian likelihood is an unbiased estimator, and that the effective reduced is centered 
around 1. 
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The amplitude of the covariance between TT and TE power spectra is ~ f/(l + nf /Cf ) 
where r is the correlation term {Cf^f{Cf^Cj^y^ ~ 0.2. Since Cf^/nf^ « 0.25 for 1-yr 
data, wc neglect this term, but we will include it in the 2+ yr analysis as it becomes increasingly 
important. 

We provide a subroutine that reads in a set of Cf^ (TT, or TE or both) and returns the 
likelihood for the WMAP dataset including all the effects described in this section. The routine is 
available at http://lambda.gsfc.nasa.gov. 



3. MARKOV CHAINS MONTE CARLO LIKELIHOOD ANALYSIS 

The analysis described in Spergel et al. (2003) and Peiris et al. (2003) is numerically demand- 
ing. At each point in the six or more dimensional parameter space a new model from CMBFAST^^ 
(Seljak Sz Zaldarriaga 1996) is computed. Our version of the code incorporates a number of cor- 
rections and uses the RECFAST (Seager et al. 1999) recombination routine. Most of the likelihood 
calculations were done with four shared memory 32 CPU SGI Origin 300 with 600 MHz processors. 
With 8 processors per calculation, each evaluation of C MB FAST for i < 1500 for a flat reionized 
A dominated universe requires 3.6 seconds. (The scaling is not linear; with 32 processors each 
evaluation requires 1.62 seconds.) 

A grid-based likelihood analysis would have required prohibitive amounts of CPU time. For 
example, a coarse grid (~ 20 grid points per dimension) with six parameters requires ~ 6.4 x 10^ 
evaluations of the power spectra. At 1.6 seconds per evaluation, the calculation would take ~ 
1200 days. Christensen k Meyer (2000) proposed using Markov Chain Monte Carlo (MCMC) to 
investigate the likelihood space. This approach has become the standard tool for CMB analyses 
(e.g., Christensen et al. 2001; Knox et al 2001; Lewis & Bridle 2002; Kosowsky et al. 2002) and is 
the backbone of our analysis effort. For a flat reionized A dominated universe, we can evaluate the 
likelihood ~ 120, 000 times in < 2 days using four sets of eight processors. As we explain below, 
this is adequate for finding the best fit model and for reconstructing the 1- and 2-a confidence levels 
for the cosmological parameters. 

We refer the reader to Gilks et al. (1996) for more information about MCMC. Here, we will 
only provide a brief introduction to the subject and concentrate on the issue of convergence. 



We used the parallelized version 4.1 of CMBFAST developed in collaboration with Uros Seljak and Matias Zal- 
darriaga. 



-10- 



3.1. Markov Chain Monte Carlo 

MCMC is a method to simulate posterior distributions. In particular, we simulate observations 
from the posterior distribution V{a\x), of a set of parameters a given event x, obtained via Bayes' 
Theorem, 

V{x\a)V{a) 

' ^ fV{x\a)V{a)da' ^ ' 

where V{x\a) is the likelihood of event x given the model parameters a and V{a) is the prior 
probability density. For our application the WMAP a denotes a set of cosmological parameters 
(e.g., for the standard, flat ACDM model these could be, the cold-dark matter density parameter 
f^c, the baryon density parameter $7^, the spectral slope ng, the Hubble constant -in units of 100 
km s~^ Mpc~^)- h, the optical depth r and the power spectrum amplitude A), and event x will be 
the set of observed Q. 

The MCMC generates random draws (i.e. simulations) from the posterior distribution that are 

a "fair" sample of the likelihood surface. From this sample, we can estimate all of the quantities of 
interest about the posterior distribution (mean, variance, confidence levels). The MCMC method 
scales approximately linearly with the number of parameters, thus allowing us to perform likelihood 
analysis in a reasonable amount of time. 

A properly derived and implemented MCMC draws from the joint posterior density 'P(a\x) 
once it has converged to the stationary distribution. The primary consideration in implementing 
MCMC is determining when the chain has converged. After an initial "hum-in" period, all further 
samples can be thought of as coming from the stationary distribution. In other words the chain 
has no dependence on the starting location. 

Another fundamental problem of inference from Markov chains is that there are always areas 
of the target distribution that have not been covered by a finite chain. If the MCMC is run for a 
very long time, the ergodicity of the Markov chain guarantees that eventually the chain will cover 
all the target distribution, but in the short term the simulations cannot tell us about areas where 
they have not been. It is thus crucial that the chain achieves good "mixing". If the Markov chain 
does not move rapidly throughout the support of the target distribution because of poor mixing, it 
might take a prohibitive amount of time for the chain to fully explore the likelihood surface. Thus it 
is important to have a convergence criterion and a mixing diagnostic. Plots of the sampled MCMC 
parameters or likelihood values versus iteration number are commonly used to provide such criteria 
(left panel of Figure 3). However, samples from a chain are typically serially correlated; very high 
auto-correlation leads to little movement of the chain and thus makes the chain to "appear" to 
have converged. For a more detailed discussion see Gilks et al. (1996). Using a MCMC that has 
not fully explored the likelihood surface for determining cosmological parameters will yield wrong 
results. We describe below the method we use to ensure convergence and good mixing. 
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3.2. Convergence and Mixing 

We use the method proposed by Gelman & Rubin (1992) to test for convergence and mixing, 
They advocate comparing several sequences drawn from different starting points and checking to 
see that they are indistinguishable. This method not only tests convergence but can also diagnose 
poor mixing. For any analysis of the WMAP data, we strongly encourage the use of a convergence 
criterion. 

Let us consider M chains (the analyses in Spergel et al. (2003) and Peiris et al. (2003) use 

4 chains unless otherwise stated) starting at well-separated points in parameter space; each has 
2N elements, of which we consider only the last N: {y^} where i = 1, .., and j = 1, ..,M, i.e. y 
denotes a chain element (a point in parameter space) the index i runs over the elements in a chain 
the index j runs over the different chains. We define the mean of the chain 

1 ^ 

y' = l^E^' (20) 



N 

i=l 



and the mean of the distribution 



NM 

i 

y 



y y] ■ (21) 

NM ^ ' 

lj=l 



We then define the variance between chains as 

M 



and the variance within a chain as 



= M^T E(5' - S)'. (22) 



The quantity 

V!»:+Mi±i) (24) 

is the ratio of two estimates of the variance in the target distribution: the numerator is an estimate 
of the variance that is unbiased if the distribution is stationary, but is otherwise an overestimate. 
The denominator is an underestimate of the variance of the target distribution if the individual 
sequences did not have time to converge. 

The convergence of the Markov chain is then monitored by recording the quantity R for all 
the parameters and running the simulations until the values for R are always < 1.1. Gelman (Kaas 
et al. 1997) suggest to use values for R < 1.2. Here, we conservatively adopt the criterion R < 1.1 
as our definition of convergence. We have found that the four chains will sometimes go in and out 
of convergence as they explore the likelihood surface, especially if the number of points already 
in the chain is small. To avoid this, one could run many chains simultaneously or run one chain 



- 12 - 



for a very long time (e.g., Panter et al. (2002)). Due to CPU-time constraints, we run four chains 
until they fulfill both of the following criteria a) they have reached convergence, and b) each chain 
contains at least 30,000 points. In addition to minimizing chance deviations from convergence, we 
find that this many points are needed to be able to robustly reconstruct the 1-and 2-a levels of the 
marginalized likelihood for all the parameters. For most chains, the burn-in time is relatively rapid, 
so that we only discard the first 200 points in each chain, however the results are not sensitive to 
this procedure. 

3.3. Markov Chains in Practice 

In this section we explain the necessary steps to run a MCMC for the CMB temperature 
power spectrum. It is straightforward to generalize these instructions to include the temperature- 
polarization power spectrum and other datasets. The MCMC is essentially a random walk in 
parameter space, where the probability of being at any position in the space is proportional to the 
posterior probability. 

Here is our basic approach: 

1) Start with a set of cosmological parameters {ai}, compute the C\ and the likelihood L\ = 

2) Take a random step in parameter space to obtain a new set of cosmological parameters {02}. 
The probability distribution of the step is taken to be Gaussian in each direction i with r.m.s 
given by cTj. We will refer below to crj as the "step size". The choice of the step size is 
important to optimize the chain efficiency (see §3.4.2) 

3) Compute the C^*^ for the new set of cosmological parameters and their likelihood £.2- 

4.a) If L2I ll\ > 1, "take the step" i.e. save the new set of cosmological parameters {0:2} as part 
of the chain, then go to step 2 after the substitution {ai} — > {02}- 

4.b) If £,2! C\ < 1, draw a random number x from a uniform distribution from to 1. If a; > C2I £\ 
"do not take the step", i.e. save the parameter set {oi} as part of the chain and return to 
step 2. If X < £2/'^!, " take the step", i.e. do as in 4. a). 

5) For each cosmological model run four chains starting at randomly chosen, well-separated 
points in parameter space. When the convergence criterion is satisfied and the chains have 
enough points to provide reasonable samples from the a posteriori distributions (i.e. enough 
points to be able to reconstruct the 1- and 1-a levels of the marginalized likelihood for all the 
parameters) stop the chains. 

It is clear that the MCMC approach is easily generalized to compute the joint likelihood of WMKP 
data with other datasets. 
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3.4. Improving MCMC Efficiency 

The Markov chain efficiency can be improved in different ways. We have tuned our algorithm 
by reparameterization and optimization of the step size. 



3.4- 1- Reparameterization 

Degeneracies and poor parameter choices slow the rate of convergence and mixing of the Markov 
Chain. There is one near-exact degeneracy (the geometric degeneracy) and several approximate 
degeneracies in the parameters describing the CMB power spectrum (Bond et al. 1994; Efstathiou 
& Bond 1999). The numerical effects of these degeneracies are reduced by finding a combination 
of cosmological parameters (e.g., $7c, ^^5, h, etc.) that have essentially orthogonal effects on the 
angular power spectrum. The use of such parameter combinations removes or reduces degeneracies 
in the MCMC and hence speeds up convergence and improves mixing, because the chain does 
not have to spend time exploring degeneracy directions. Kosowsky et al. (2002) introduced a set 
of reparameterizations to do just this. In addition, these new parameters reflect the underlying 
physical effects determining the form of the CMB power spectrum (we will refer to these as physical 
parameters). This leads to particularly intuitive and transparent parameter dependencies of the 
CMB power spectrum. 

Following Kosowsky et al. (2002), we use a core set of six physical parameters. There are 
two parameters for the physical energy densities of cold dark matter, lOc = 0,ch^, and baryons, 
ujj, = Qiih?. There is a parameter for the characteristic angular scale of the acoustic peaks, 

where a^ec is the scale factor at decoupling, 

rs{aaec) = ^[1 + ((1 - ^)x^ + ^Ax'-'"" + ^mX + ^rad) 

is the sound horizon at decoupling, and 



dx (26) 



dA{adec) = / [(1 - ^)X^ + ^KX^-^"" + ^mX + ^rad] dx (27) 

is the angular diameter distance at decoupling, where Hq denotes the Hubble constant and c is the 
speed of light. Here 0,^ = Clc + 0,1,, Q,\ denotes the vacuum energy density parameters, w is the 
equation of state of the dark energy component, Q = $7^ + J7a and the radiation density parameter 
Orad = ^-y + Qu, ^'y, are the the photon and neutrino density parameters respectively. For 
reionization we use the physical parameter Z = exp(— 2r) where r denotes the optical depth to the 
last scattering surface (not the decoupling surface). The remaining two core parameters are the 
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spectral slope of the scalar primordial density perturbation power spectrum, n^, and the overall 
amplitude of the primordial power spectrum A. Both are normalized at A; = 0.05 Mpc~^ {£ ^ 700). 

For more complex models we add other parameters as described in Spergel et al. (2003) and 
Pciris et al. (2003) and in §5. To investigate non-flat models we use the vacuum energy, uj\ = ^jji^. 
Other examples include the tensor index, n^, the tensor to scalar ratio, r, and the running of the 

scalar spectral index, dus/dlnk. 

Here, we relate the input parameter for the overall normalization, A, as in the CMBFAST code 
(version 4.1 with UNNORM option), to the amplitude of primordial comoving curvature perturba- 
tions n, A^{ko) = (A;■V27r2)(|7^p). We also relate our convention for the tensor perturbations to 
the one in the code. CMBFAST calculates 

= i^n)TSj^Alik)[gUk)]\ (28) 

= (4Tr)To^|^A2(fc)[5^,(fc)]', (29) 

where is the Newtonian potential, gxiik) is the radiation transfer function, and Tq = 2.725 x 10^ 
is the CMB temperature in units of ^uK. The tilde denotes that A|(fc) is used in CMBFAST, but 
differs from our convention, Af^(k), where A| = A^/16. The comoving curvature perturbation, TZ, 
is related to * by * = -(3/5)7^; thus, A^(/c) = (25/9)A|(A;). Note that this relation holds from 
radiation domination to matter domination with accuracy better than 0.5%. 

CMBFAST uses A to parameterize A^(fco)- The tensor perturbations are calculated accord- 
ingly. The relations are 

A|(fco) = (30) 
Aliko) = ^A2(feo) = ^Al(feo) = g^Al(feo). (31) 



Therefore, one obtains 



Aniko) = 2.95 x 10'^ A. (32) 



The amplitude A is normalized at ki = 0.05 Mpc ^ and the tensor to scalar ratio r is evaluated 
at ko = 0.002 Mpc~^, unless otherwise specified. To convert A{ko) to A{ki), we use 

(7 \ ns(A;o)-l+|(dri.s/dlnfc)ln(fei/feo) 
■ ^^^^ 



3.4-2. Step Size Optimization 



The choice of the step size in the Markov Chain is crucial to improve the chain efficiency and 
speed up convergence. If the step size is too big, the acceptance rate will be very small; if the step 
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size is too small the acceptance rate will be high but the chain will exhibit poor mixing. Both 
situations will lead to slow convergence. For our initial step sizes for each parameter we use the 
standard deviation for each parameter when all the other parameters arc held fixed at the maximum 
likelihood value. These are easy to find once a preliminary chain has been run and the likelihood 
surface has been fitted, as explained in §3.4.3. If a given parameter is roughly orthogonal to all 
the other parameters, it is not necessary to adjust the step size further; in the presence of severe 
degeneracies the step size estimate needs to be increased by a "banana correction" factor which is 
approximately the ratio of the projection of the 1-a error along the degeneracy to the projection 
perpendicular to the degeneracy. 

With these optimizations the convergence criterion is satisfied for the 4 chains after roughly 
30,000 steps each (2A^ = 30, 000) for a model with 6 parameters. On a Origin 300 machine this 
takes roughly 32 hrs running each chain on 8 processors. These numbers serve only as a rough 
indication: convergence speed depends on the model and on the data-set: for a fixed number of 
parameters, convergence can be significantly slower if there are severe degeneracies among the 
parameters; adding more datasets might slow down the evaluation of a single step in the chain, but 
can also speed up convergence by breaking degeneracies. 



3.4-3. Likelihood Surface Fitting 

The likelihood surface explored by the MCMC was found to be functionally well approximated 
by a quartic expansion of the cosmological parameters (for example, {a^} = {loi,, Uc, n, 6a, -Z , A}): 

y = log{ji:) = qo + J2qiSi+J2qpiSj+ J2 qfSiSj6k+ ^t^iWl- (34) 

i i<j i<j<k i<j<k<l 

Here q are fit coefficients and Si are related to the cosmological parameters via 6i = {ai — a^) / ai, 
where is the maximum-likelihood value of the parameter. Lower-order expansions were unable 
to reproduce the likelihood surface. With 6 parameters there are Mf = 210 fit coefficients. Writing 
(34) as y = q ■ X, the minimum least-squares estimator for q is 

q= {X^Xr'X^y, (35) 

(i) 

where X is the A'^ x Mf matrix Xij = Xj , N the number of unique points in the chain. 

We run preliminary MCMC chains with "guesstimatcd" step sizes until there are ~ 1000 unique 
points in total. Then we use equation 34 to cut through the likelihood surface at the maximum 
likelihood value to find the la level in each parameter direction (see § 3.4.2). This defines our "step 
size" for subsequent chains. 
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3.5. The Choice of Priors 

From Bayes' Theorem (equation 19) we can infer V{ai\x), the probabihty of the model param- 
eters ctj given the event x (i.e. our observation of the power spectra), from the Hkehhood function 
once the prior is specified. It is reasonable to take prior probabilities to be equal when nothing is 
known to the contrary (Baycs' postulate). Unless otherwise stated we assume uniform priors on 
the parameters given in Tabic 1. Note that we assume uniform priors on Uc, cob, and 9a rather than 
uniform priors on O^, and Hq. 

Except for the priors on r, and w (the equation of state of the dark energy component), the 
MCMCs never hit the imposed boundaries, thus most of our choices for priors have no effect on 
the outcome. A detailed discussion about the prior on r is presented in Spergel et al. (2003). We 
set lower bound on w at —3.2 (—1.2) but we discard the region of parameter space where u; < — 3 
{w < —1). This is necessary because our best-fit value for this parameter is close to the boundary. 
If we had instead set the prior to he w > —3 {w > —1), then the chains would fail to be a fair 
representation of the posterior distribution in the region of parameter space where the distance 
from the boundary is comparable to the stepsize. 

3.6. MCMC Output Analysis 

We merge the 4 converged MCMC chains (> 120,000 points) into one. From this we give the 
cosmological parameters that yield our best estimate of Q and we give the marginalized distribution 
of the parameters. We compute the marginalized distribution for one parameter, and the joint 
distribution for two parameters, obtained marginalizing over all the other parameters. Since the 
MCMC passes objective tests for convergence and mixing, the density of points in parameter space 
is proportional to the posterior probability of the parameters. 

The marginalized distribution is obtained by projecting the MCMC points. For the marginal- 
ized parameters values o-j, Spergel et al. (2003) quote the expectation value of the marginalized 
likelihood, J Caidai = ^/N ^^at^i- Here, N is the number of points in the merged chain and at^i 
denotes the value of parameter aj at the t-th step of the chain. The last equality becomes clear 
if we consider that the MCMC gives to each point in parameter space a "weight" proportional to 
the number of steps the chain has spent at that particular location. The 100(l-2p)% confidence 
interval [cp, ci_p] for a parameter is estimated by setting Cp to the p*'* quantile of at^i, t = 1, . . . , N 
and Cp-i to the (1 — quantile. The procedure is similar for multidimensional constraints: the 
density of points in the n-dimensional space is proportional to the likelihood and multi-dimensional 
confidence levels can be found as illustrated in §15.6 of Press et al. (1992). 

We note that the global maximum likelihood value for the parameters does not necessarily 
coincide with the expectation value of their marginalized distribution if the likelihood surface is not 
a multi-variate Gaussian. We find that, for most of the parameters, the maximum likelihood values 
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Table 1. Priors for Bayesian Analyses 



Item 



< (Jc < 1 

< Wfe < 1 
0.005 <eA< 0.1 
< r < 0.3 
0.5 <A< 2.5 
< Uslko < 2 
< riiso < 2 
< /iso < 5000 
-0.5 < dn/dlnk < 0.5 
< r < 2.5 
< a;,. < 1 
-3.2(-1.2)^< w<(} 
< < 1 



'^We will present two 
sets of results, one with 
the prior w > —1.2 the 
other with w > —3.2 
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of the global joint fit are consistent with the expectation values of the marginalized distribution. 

A virtue of the MCMC method is that the addition of extra data sets in the joint analysis 
can efficiently be done with minimal computational effort from the MCMC output if the inclusion 
of extra data set does not require the introduction of extra parameters or does not drive the 
parameters significantly away from the current best fit. For example, we add Lyman a power 
spectrum constraint to MCMC's outputs, but we cannot do this for the 2dFGRS, since this requires 
the introduction of two extra parameters (/3, dp, see §5.1 below for more details). 

If the likelihood surface for a subset of parameters from an external (independent) data set is 
known, or if a prior needs to be added a posteriori, the joint likelihood surface can be obtained by 
multiplying the likelihood with the posterior distribution of the MCMC output. In Spergel et al. 
(2003) we follow this method to obtain the joint constraint of CMB with Supernovae la (Riess 
et al. 1998, 2001) data and CMB with Hubble Key project Hubble constant (Preedman et al. 2001) 
determination. 

There is yet another advantage of the MCMC technique. The current version of CMBFAST 
with the nominal interpolation settings is accurate to 1%, but random numerical errors can some- 
times exceed this. As the precision of the CMB measurements improve, these effects can become 
problematic for any approach that calculates derivatives as a function of parameters. Because 
MCMC calculations average over ~100,000 CMB calculations, the MCMC technique is much less 
sensitive than either grid-based likelihood calculations or methods that numerically calculate the 
Fisher matrix. 

4. EXTERNAL CMB DATA SETS 

The CBI (Mason et al. 2002; Sievers et al. 2002; Pearson et al. 2002) and the ACBAR (Kuo 
et al. 2002) experiments complement WMAP by probing the amplitude of CMB temperature power 
spectrum at £ > 900. These observations probe the Silk damping tail and improve our analysis in 
2 ways: a) improve our ability to constrain the baryon density, the amplitude of fluctuations and 
the slope of the matter power spectrum, and b) improve convergence by preventing the chains from 
spending long periods of time in large, moderately low-likelihood regions of parameter space. 

The CBI data set is described in Mason et al. (2002), Pearson et al. (2002) and on their web 
site^^. We use data from the CBI mosaic data set (Pearson et al. 2002) and do not include the deep 

data set as the two data sets arc not independent. We use the three bandpowers from the even 
binning at central i values of 876, 1126, 1301, thus ensuring that the chosen bands power can be 
considered independent from the WMAP data. At i > 1500, the CBI experiment detected excess 
power. If the rms amplitude of mass fluctuations on scales of 8 h^^ Mpc, is ag ~ 1, then this excess 



http://www.astro.caltech.edu/~tjp/CBI/data/index.html (last update August 2002) 
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power can be interpreted as due to Sunayev-Zeldovich distortion from undetected galaxy clusters 
(Mason et al. 2002; Bond et al. 2002; Komatsu & Seljak 2002). We simplify our analyses by not 
using the CBI data on scales where this effect can be important. The correlations between different 
band powers are taken into account with the full covariance matrix; we use the lognormal form 
of the likelihood (as in Pearson et al 2002). In addition, we marginalize over a 10% calibration 
uncertainty (CBI beam uncertainties are negligible). 

The ACBAR data set is described in Kuo et al. (2002). We use the 7 band-powers at multipoles 
842, 986, 1128, 1279, 1426, 1580, 1716. As shown in Figure 4, these points do not overlap with 
the WMAP power spectrum except at £ ~ 800 where WMAP is noise-dominated. As shown 
in Figure 4 the ACBAR experiment is less sensitive to Sunyaev-Zel'dovich contamination than 
CBI. We compute the likelihood analysis for cosmological parameters for the ACBAR data set 
following Goldstein et al. (2002) and using the error bars given in ACBAR web site^^. In addition 
we marginalize over conservative beam and calibration uncertainties (B. Holzapfel 2002, private 
communication). In particular we assume a calibration uncertainty of 20% (the double of the 
nominal value) and 5% beam uncertainty (60% larger than the nominal value) . 

The ACBAR and CBI data are completely independent from each other (they map different 
regions of the sky) and from the WMAP data (the band-powers we consider span different i ranges) . 
To perform the joint likelihood analysis, we simply multiply the individual likelihoods. 

5. ANALYSIS OF LARGE SCALE STRUCTURE DATA 

We can enhance the scientific value of the CMB data from z ~ 1089 by combining it with 
measurements of the low redshift universe. Galaxy redshift surveys allow us to measure the galaxy 
power spectrum at z ~ and observations of Lyman a absorption of about 50 quasar spectra 
(Lyman a forest) allow us to probe the dark matter power spectrum at redshift z ^ 3. 

We use the Anglo-Australian Telescope Two Degree Field Galaxy Redshift Survey (2dFGRS) 
(Colless et al. 2001) as compiled in February 2001. This survey probes the universe at redshift 
ZeS ~ 0.1 and probes the power spectrum on scales corresponding to 0.022 < A; < 0.2 (where k is in 
units of h Mpc~^. The anticipated Sloan Digital Sky Survey (Gunn & Knapp 1993) power spectrum 
will be an important complement to 2dFGRS. We also use the linear matter power spectrum 
as recovered by Croft et al. (2002) from Lyman a forest observations. This power spectrum is 
reconstructed at an effective redshift z ~ 2.72 and probes scales k > 0.2 h Mpc~^. Together these 
data sets allow us not only to probe a wide range of physical scales — from A; ~ 1 x 10~^ (30000 
Mpc h~^) to A; ~ 1 (3 Mpc h~^) — (see Figure 5), but also to probe the evolution of a given scale 
with redshift as well. 



http: / / cosmology.berkeley.edu/group/swlh/acbar/data/ 
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When including LSS data sets one should keep in mind that the underlying physics for these 
data sets is much more complicated and less well understood than for WMAP data, and systematic 
and instrumental effects are much more important. We attempt here to include all the known 
(up to date) uncertainties and systematics in our analysis. In what follows, we illustrate our 
modeling of the "real-world" effects of LSS surveys and how we propagate systematic and statistical 
uncertainties into the parameters estimation. The goal of our modeling is to relate not just the 
shape but also the amplitude of the observed power spectrum to that of the linear matter power 
spectrum as constrained by CMB data. The reason for this will be clear in §5.1.3; by using 
the information in the power spectrum amplitude we can break some of the degeneracies among 
cosmological parameters. 

5.1. The 2dFGRS Power Spectrum 

The 2dFGRS power spectrum, as released in June 2002, has been calculated from the February 
2001 catalog that includes 140,000 galaxies (Percival et al. 2001). The full survey is composed of 
220,000 galaxies but is not yet available. The sample is magnitude-limited at bj = 19.45 and thus 
probes the universe at ~ 0.1 and the power spectrum on scales corresponding to k > 0.015 
h Mpc^^. The input catalog is an extended version of the Automatic Plate Machine (APM) 
galaxy catalog (Maddox et al. 1990b, a, 1996) which includes about 5 million galaxies to bj = 20.5. 
The APM catalog was used previously to recover the 3-d power spectrum of galaxies by inverting 
the clustering properties of the 2-d galaxy distribution (Baugh &; Efstathiou 1993; Efstathiou & 
Moody 2001). These techniques, however, are affected by sample variance and uncertainties in the 
photometry; a full 3-d analysis is thus more reliable. 

The power spectrum of the galaxy distribution as measured by LSS surveys, such as the 
2dFGRS, cannot be directly compared to that of the initial density fluctuations as predicted by 
theory, or recovered from WMAP or, the combination of WMAP+CBI+ACBAR data-sets. This 
is due to a number of intervening effects that can be broadly divided in two classes: effects due to 
the survey geometry (i.e., window function, selection function effects) and effects intrinsic to the 
galaxy distribution (e.g., redshift-space distortions, bias, non-linearities). 

5.1.1. Survey Geometry 

Galaxy surveys such as the 2dFGRS are magnitude-limited rather than volume-limited, thus 
most nearby galaxies are included in the catalog while only the brighter of the more distant galaxies 
are selected. The selection function accounts for the fact that fewer galaxies are included in the 
survey as the distance (or the rcdshift) increases. An additional effect arises from the fact that the 
clustering properties of bright galaxies might be different from the average clustering properties of 
the galaxy population as a whole. The selection function does not take this into account (we will 
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return to this point in §5.1.2). 

Moreover, the completeness across the sky is not constant and the survey can only cover a 
fraction of the whole sky, sometimes with a very complicated geometry described by the window 
function. In particular, for the data we use, unobserved fields make the survey completeness a 
strongly varying function of position. The measured Fourier coefficients are therefore the true 
coefficients of the galaxy distribution convolved by the Fourier transform of the selection function 
(in the direction of the line of sight) and of the window function (on the plane of the sky). In this 
section, we follow the standard notation used in LSS analyses and refer to all of these effects as 
window effects. 

The window not only modifies the measured power spectrum but also introduces spurious 
correlations between Fourier modes. (See Percival et al. (2001) for more details). For the 2dFGRS 
these effects have been quantified by Monte Carlo simulations of mock catalogs of the survey^^ . We 
include them in our analysis by convolving the theory power spectrum with the window "kernel" , 
and by including off diagonal terms in the covariance matrix. 

5.1.2. Effects Intrinsic to the Galaxy Distribution 

Linear gravitational evolution modifies the amplitude but not the shape of the underlying power 
spectrum. However, in the non-linear regime (where the amphtude of fluctuations is 6p/p ~ 1) 
this is no longer the case. Non-linear gravitational evolution changes the shape of the power 
spectrum and introduces correlations between Fourier modes. This effect becomes important on 
scales k ~ 0.1 h Mpc~^, but the exact scale at which it appears and its detailed characteristic 
depend on cosmological parameters. Most of the clustering signal from galaxy surveys such as 
2dFGRS comes from the regime where non-linearities are non-negligible because shot noise is the 
dominant source of error at A; > 0.5 h Mpc~^ and the number density of modes scales as A;^. These 
non linearities encode additional information about cosmology and motivates their inclusion in the 
present analysis. This approach is complicated by the fact that an accurate description of the 
fully non-linear evolution of the galaxy power spectrum is complicated. In the literature, there are 
several different approaches to modeling the non-linear evolution of the underlying dark matter 
power spectrum in real space: (1) linear (and extended) perturbation theory; (2) semi-analytical 
modeling and (3) numerical simulation. All of these approaches yield consistent results on the 
scales used in our analysis. We will use the semi-analytical approach developed by Hamilton et al. 
(1991) and Peacock & Dodds (1996). In particular, wc use the Ma et al. (1999) formulation of 
the non-linear power spectrum. Figure 6 shows the effect of non-linearities on the matter power 
spectrum on the scales of interest (compare solid and dashed lines). 



^^For WMAP data, we deconvolve the raw measured Ci by the effect of the window (the mask), thus leaving the 
effect of the window function and the mask only in the fisher matrix. For LSS we will convolve the theory with the 
window, project the power spectrum into redshift space and compare this to the observed power spectrum. 
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Theory predicts the statistical properties of the continuous matter distribution, while obser- 
vations are concerned with the galaxy distribution, which is discrete. Moreover, galaxies might not 
be faithful tracers of the mass distribution (i.e. the galaxy distribution might be biased). In the 
analysis of galaxy surveys it is assumed that galaxies form a Poisson sampling of an underlying 
continuous field which is related to the matter fluctuation field via the bias. It is possible to for- 
mally relate the discrete galaxy field and its continuous counterpart. For the power spectrum, this 
consists of the subtraction from the measured galaxy power spectrum of the shot noise contribTition. 
The published power spectra from galaxy surveys already have this contribution subtracted, but 
are still biased with respect to the underlying mass power spectra. 

The idea that galaxies are biased tracers of the mass distribution even on large scales was 
introduced by Kaiser (1984) to explain the properties of Abell clusters. Nevertheless, the fact 
that galaxies of different morphologies have different clustering properties (hence different power 
spectra) was known much before (e.g., Hubble (1936); Dressier (1980); Postman k, Geller (1984)). 
Since the clustering properties of different types of galaxies are different, they cannot all be good 
tracers of the underlying mass distribution^^ . 

In the simplest biasing model, the linear bias model, the mass and galaxy fractional overdensity 
fields 6 and 5g are related by (5p(x) = 65 (x). This implies that on all scales 

P,(fc) = b^P{k). (36) 

This simple model (although justified by the Kaiser (1984) assumption that galaxies form on the 
highest peaks of the mass distribution) cannot be true in detail for two reasons. The first is 
that, on a fundamental level, the galaxy fluctuation field on small smoothing scales could become 
5g < —1 which corresponds to a negative galaxy density. The second is that, from an observational 
point of view, this scheme leaves the shape of the power spectrum unchanged while not all galaxy 
populations have the same observed power spectrum shape, although the differences are not large 
(e.g., Peacock & Dodds (1994); Norberg et al. (2001)). Many different and more complicated biasing 
schemes have been introduced in the literature. For our purposes it is important to note that the 
bias of a sample of galaxies depends on the sample selection criteria and on the weighting scheme 
used in the analysis. Thus different surveys will have different biases, and care must be taken when 
comparing the different galaxy power spectra. 

There are several indications that large-scales galaxy bias is scale independent on large scales 
(e.g., Hoekstra et al. (2002); Verde et al. (2002)). This justifies adopting equation 36. For the 
2dFGRS, the bias of galaxies has been measured by Verde et al. (2002), by using higher-order 
correlations of the galaxy fluctuation field. They assume a generalization of the simple linear 
biasing scheme, 5g = b\5 + b2/25'^ . They find no evidence for scale-dependent bias at least on linear 



^ Galaxies are likely to be formed in the very high-density regions of the matter fluctuation field, thus they are 
formed very biased at 2; >> (e.g., Lyman break galaxies). But then gravitational evolution should make the galaxy 
distribution less and less biased as time goes on (e.g., Pry (1996)) 
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and mildly non-linear scales (i.e. k < 0.4 h Mpc~^) and 62 consistent with 0. This finding further 
supports the use of equation 36. In particular, they find bi = 1.06 it 0.11. In our analysis we will 
assume linear biasing. 

The Verde et al. (2002) bias measurement has to be interpreted with care. It applies to 
2dFGRS galaxies weighted with a modification of the Feldman et al. (1994) weighting scheme as 
described in Percival et al. (2001). It is important to note that, close to the observer, dim galaxies 
are included in the survey; the galaxy density is high, but a small volume of the sky is covered. On 
the other hand, far away from the observer, only very bright galaxies are included in the survey; a 
large volume is probed but the galaxy density is low. As a consequence, clustering of dim galaxies 
in a small volume close to the observer contain most of the signal for the power spectrum at small 
scales. While rare, bright galaxies in a large volume enclose most of the information about the 
power spectrum on large scales. An "optimal" weighting scheme would thus weight dim galaxies 
on small scales and bright galaxies on large scales. This weighting scheme is, unfortunately, biased. 
Bright galaxies are more strongly clustered (i.e. more biased) than dim ones. This effect is known as 
"luminosity bias" . The power spectrTim recovered from such a weighting scheme will have optimal 
error bars, but will exhibit scale-dependent bias. The weighting scheme used in Percival et al. 
(2001) is not optimal, but is virtually unaffected by luminosity bias (Percival, Verde &: Peacock 
2003). The power spectrum so obtained is that of 2 galaxies on virtually all scales, and the 
effective redshift for the power spectrum is z^s = 0.17, slightly larger than the effective redshift of 
the survey as defined by the selection function (Percival et al. 2001; Peacock et al. 2001). 

The final complication is that galaxy catalogs use the redshift as the third spatial coordinate. 
In a perfectly homogeneous Friedman universe, redshift would be an accurate distance indicator. 
Inhomogeneities, though, perturb the Hubble flow and introduce peculiar velocities. As Kaiser 

(1987) emphasized, the peculiar velocities distort the clustering pattern not only on small scales 
where virialized objects produce "Fingcrs-of-God" , but also on large scales where coherent flows 
produce large scale distortion components. 

On large (linear) scales the rcdshift-space effect on an individual Fourier component of the 
density fluctuation field 6^ can be modeled by. 



where the superscript s refers to the quantity in redshift space and /j, is the cosine of the angle 
between the /c-vector and the line of sight. The Kaiser factor, /?, is the linear redshift space distortion 
parameter. One defines (3 = f/b, where / = d\nS/d\na, with 6 = Sp/p and a = (1 + z)~^; b is the 
linear bias parameter. The expression for f{z) is a known function of $7^, A and z (Lahav et al. 



5k 



4(l + /3/i'), 



(37) 



1991), 




1 -1 



f{nrn,A,z) = x 



-1 



A 



(38) 



{l + zf 



-24- 

where X = 1 + fi^z + A(a^ — 1), and can be approximated by^° P ~ ^}^/b. The analysis of 
the 2dFGRS (Peacock et al. 2001; Percival et al. 2001) constrains / at the effective redshift of the 
survey. The effective redshift of the survey depends on the galaxy weighting scheme adopted to 
compute the power spectrum for the above work (zes ~ 0.17). This pecuhar velocity infall causes 
the over density to appear squashed along the line of sight. The net effect on the angle-averaged 
power spectrum in the small angle approximation is 

P%k)=Pik) (^1 + 1/3+ ^(3^^ 

Thus on large scales the redshift space distortions boost the power spectrum if /3 > 0. 

On smaller scales, virialized motions produce a radial smearing and the associated "Fingers- 
of-God" effect contaminates the wavelengths we are interested in. This is difficult to treat exactly, 
but as it is a smearing effect, it produces a mild damping of the power, acting in the opposite 
direction to the large-scale boosting by the Kaiser effect (see for example Matsubara (1994)). On 
these scales, the redshift space correlation function is well modeled as a convolution of the real 
space isotropic correlation function with some distribution function for the line of sight velocities 
(e.g. Davis & Peebles (1983); Cole et al. (1994); Fisher (1995)). Since the convolution in real space 
is equivalent to multiplication in Fourier space, the redshift space power spectrum on small scales is 
multiplied by the square of the Fourier transform of the velocity distribution function (e.g.. Peacock 
& Dodds (1994)), 

P'{k, n) = P{k){l + pfi''fD{kapn) , (40) 

where ap denotes the line-of-sight pairwise velocity dispersion. If the pairwise velocity distribution 
is taken to be an exponential (e.g. Ballinger ct al. (1995. 1996); Hatton &; Cole (1998)) which 
seems to be supported by simulations (e.g., Zurck ct al. (1994)) and observations (e.g., Marzke 
et al. (1995)), then the damping factor is a Lorentzian (see also Kang et al. (2002)), 

^^^""^^^ = l + (Pa2^2)/2 

We adopt this functional form as it is used by Peacock et al. (2001) in determining the redshift 
space distortion parameters f3 and ap from the 2dFGRS. The overall effect for the power spectrum 
in a thin shell in A;-space is given by 

Pik) (42) 

obtained by averaging over /j, in equation (40) with the damping factor given by equation (41). 
Figure 6 show the effect of redshift space distortions (equation 42) on the scales of interest. 



In our analysis we use the exact expression for /? as in equation(38). 



(39) 



(41) 
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This model is simplistic for several reasons. The most important is that, because of the com- 
plicated geometry of the survey, the simple angle average operation performed to obtain equation 
(42) might not be strictly correct. Also, equation (42) is obtained in the plane-parallel (also known 
as small-angle) approximation (i.e. as if the lines of sight to different galaxies on the sky were 
parallel) . 

We have performed extensive testing of equation (42) using mock 2dFGRS catalogs obtained 
from the Hubble volume simulation. We find that the simulations redshift-space power spectrum 
is consistent, given the errors, with equation (42) where P{k) is the simulations real-space power 
spectrum up to < 0.4 h Mpc"-*^, even for the complicated geometry of the 2dFGRS. This means 
that up to ~ 0.4 the systematics introduced by eq. (42) are smaller than the statistical errors; in 
the analysis we use only k < 0.15." However, the value for /3 in equation (42) needs to be calibrated 
on Monte Carlo realizations of the survey. We find that = 0.85/3. We have verified that our 
results for the cosmological parameters are insensitive to the exact choice of the correction factor. 
Peacock et al. (2001) measured the parameters /? and Up and their joint probability distribution 
from the survey obtaining (3 = 0.43 and ap = 385 km s^"^. This measurement has been obtained 
by using the full angular dependence of the power spectrum and therefore recovers directly /? and 
not P'^^ . Hawkins et al. (2002) measured these parameters from a larger sample than the one from 
Peacock et al. (2001), obtaining a slightly different result. This is mostly due to a shift in the 
recovered value for ap. Since most of the galaxies in the Hawkings et al. (2002) sample are in 
the Peacock et al. (2001) sample, we conservatively extend our error-bars on (3 and a by 10% and 
30% respectively, to include the new value within the l-a marginalized confidence contour, and to 
include a possible error in the determination of Z?*^^. Figure 6 illustrates the importance of including 
all the above effects in our analysis. 

In our analysis we consider data in the k range 0.02 < k < 0.2 h Mpc~^. On large scales the 
limit is set by the accuracy of the window function model; on small scales the limit is set by where 
the covariance matrix has been extensively tested. In this regime we also have a weak dependence 
on the velocity dispersion parameter ap, the parameter with the largest systematic uncertainty. 

5.1.3. Motivation for this Modeling 

The motivation behind the complicated modeling of §5.1.1-5.1.2 is to be able to infer the 
amplitude of the matter power spectrum from the observed galaxy clustering properties. 

Figures 7 and 8 illustrate how the modeling of §5.1.1-5.1.2 helps in breaking degeneracies 
among cosmological parameters. For illustration, we consider two cases below; the degeneracy of 
the dark energy equation of state, w, (Huey et al. 1999) with flf, and Ug and the ujn — h degeneracy, 
where = Qi,h^. 

Figure 7 shows two models that are virtually indistinguishable with CMB data, but which 
predict different amplitudes for the matter power spectra at z ~ 0. This is because the linear 
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growth factor and the shape parameter F are different for the two cases. The two models differ 
in the values of ujb, Ug and w. The solid line is a model with w = —0.4 while the dotted line is a 
model with w = —1. 

In Figure 8 we show two sets of cosmological parameters that differ only in the values of 
the neutrino mass and the Hubble constant. These two models are virtually indistinguishable 
with CMB observations. But the matter power spectrum in the two cases is different in shape 
and amplitude. Since redshift-space distortions and window function affect the power spectrum 
shape, extra information about cosmological parameters is encoded in its amplitude. By using this 
information, Spergel et al. (2003) obtain a cosmological upper bound on the neutrino mass that is 
~ 4 times better than current cosmological constraints (Elgar0y et al. 2002). 

For completeness, we have shown the power spectrum also for scales probed by the Lyman 
a forest (see §6). The error bars in Figures 7 and 8 are examples of the size of the 2dFGRS and 
Lyman a power spectra statistical uncertainties in one data-point, showing that the two models 
can be distinguished if the observed power spectrum can be related to the linear matter power 
spectrum without introducing large additional uncertainties. 

5. 1.4.. Practical Approach 

The procedure we adopt in order to compare the observed galaxy power spectrum with the 
theory predictions is outlined below (the published 2dFGRS galaxy power spectrum has been 
already corrected for shot noise). For a given set of cosmological parameters and a pair- wise 
velocity dispersion parameter we proceed as follows: 

1) The MCMC selects a set of cosmological parameters and values for j3 and Gp. CMBFAST 
computes the theoretical linear matter power spectrum at z = 0. 

2) We evolve the theoretical linear matter power spectrum to obtain the non-linear matter power 
spectrum at the effective redshift of the survey, following the prescription of Ma et al. (1999). 

3) We then obtain the redshift-space power spectrum for the mass by using equation 42 with 
(3"^^ calibrated from Monte Carlo realizations of the catalog. 

4) The bias is computed from (3 and $7^ using equation (38). The galaxy power spectrum is 
obtained by correcting for bias, equation 36. 

5) The resulting power spectrum is convolved with the galaxy window function. We use the 
routine provided on the 2dFGRS web site to perform this numerically. This is the power 
spectrum that can be compared with the quantity measured from a galaxy survey. 

6) We can now evaluate the likelihood using the full covariance matrix as provided by the 2dF- 
GRS team. We approximate the likelihood to be Gaussian as it was done by the team. In 
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principle this is not strictly correct since in the linear regime the power spectrum is an ex- 
ponential distribution and in the non-linear regime the distribution has contributions from 
higher-orders correlations. However, due to the size of the survey we arc considering, the 
central limit theorem ensures that the likelihood is well described by the Gaussian approx- 
imation (e.g., Scoccimarro et al. (1999)). Moreover, the covariance matrix for the 2dFGRS 
power spectrum has been computed by the 2dFGRS team under the assumption that the 
likelihood is Gaussian. 

We assume that the likelihood for the bias parameter is Gaussian, centered on 6 = 1.04 with 
dispersion (j^ = 0.11. This is a conservative overestimate of the error on the bias parameter, as 
noted in Verde et al. (2002). The determination of b is correlated with (3 and ap and the error 
quoted has already been marginalized over the uncertainties in these two parameters. In practice, 
for each step in the Markov chain we compute the likelihood according to items 1 through 6 above. 
The bias parameter is determined once ap and the other cosmological parameters are chosen. 
We then multiply the likelihood by the joint likelihood for /? and ap, as in Figure 4 of Peacock et al. 
(2001), and by the likelihood for the bias parameter. In effect, we use the determination of /?, ap, 
and b as priors. By multiplying the likelihood we assume that the measurements of the redshift 
space distortion parameters, bias, and the 2dFGRS power spectrum are independent. We justify 
this assumption below. 

The parameters needed to map the real-space non-linear matter power spectrum onto the 
redshift-space galaxy spectrum are: P, ap and b. These three parameters are not independent, 
not only is f3 oc 1/6 but, more importantly, the three parameters are measured from the same 
catalog which we are using to constrain other cosmological parameters. However, the information 
we use to constrain cosmological parameters is all encoded in the shape and amplitude of the 
angle-averaged power spectrum. The information used to measure /? and ap is all encoded in the 
dependence of the Fourier coefficients (i.e., of the power spectrum) on the angle from the line of 
sight. Thus we can treat the determinations of P and ap as independent from the likelihood for 
cosmological parameters. The analysis of Verde et al. (2002) to measure the bias parameter from 
the 2dFGRS uses both information about the amplitude of the Fourier coefficients and their angular 
dependence. This dependence, however, is not that introduced by rcdshift-space-distortions, but 
is the configuration dependence of the bispectrum. Thus, in principle we should not treat this 
measurement as completely independent. However, most of the signal for the bias measurement 
comes from the fc-range of 0.2 < k < 0.4 h Mpc~^ while the signal for the present analysis comes 
from k < 0.2 h Mpc~^. Note that the configuration dependence of the bispectrum is largely 
independent of cosmology. This allows us to easily include a prior for the bias parameter in the 
analysis. 
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6. 



LYMAN a FOREST DATA 



The Lyman a forest traces the fluctuations in the neutral gas density along the line of sight 
to distant quasars. Since most of this absorption is produced by low density unshocked gas in the 
voids or in mildly overdense regions that are thought to be in ionization equilibrium, this gas is 
assumed to be an accTiratc tracer of the large-scale distribution of dark matter. In this epoch and 
on these scales the clustering of dark matter is still in the linear regime. 

Since the Lyman a forest observations are probing the distribution of matter at z ~ 3, they 
are an important complement to the CMB data and the galaxy surveys data. Because of their 
importance, there has been extensive numerical and observational work testing the notion that they 
trace the large-scale structure. In our analyses, we find that the addition of Lyman a forest data 
appear to confirm trends seen in other data sets and tightens cosmological constraints. However, 
more observational and theoretical work is still needed to confirm the validity of the emerging 
consensus that the Lyman a forest data traces the LSS. 

Recent papers use two different approaches for analysis of the Lyman a forest power spectrum 
data. McDonald et al. (2000) and Zaldarriaga et al. (2001) directly compare the observed trans- 
mission spectra to the predictions from cosmological models. We follow the approach of Croft et al. 
(2002) and Gnedin & Hamilton (2002) (GH) who use an analytical fitting function to recover the 
matter power spectrum from the transmission spectrum^^. 

GH factorize the linear power spectrum into four terms. 



where P^^^{k) is a quantity that is independent of modeling and is is almost directly measured. 
The other parameters convert this quantity into the linear matter power spectrum and encode the 
dependence on cosmology and the modeling of the inter galactic medium (IGM). In our treatment, 
we use the values of P^^^iji) (the estimator from Lyman a forest observations of p^*^"^*) from GH 
and their parameterization in terms of equation (43) because it allows us to explicitly include the 
dependence of the recovered linear matter power spectrum on the cosmological parameters. Qq 
encodes the dependence of the recovered linear power spectrum on the matter density parameter 
at ^; = 2.72. For Qq we use the GH ansatz of. 



Qt = 20, 000 K/Tq (To ~ 20, 000 K) parameterizes the dependence on the mean temperature of the 
IGM, Qt ~ 1.11 parameterizes the dependence on the assumed mean optical depth. In addition to 



■^^ After the present paper was submitted, a preprint appeared (Scljak et al. 2003) claiming that the treatement of 
GH and Croft et al. (2002) significantly underestimate the errors. Given the importance of this data set to tighten 
cosmological constraints, the Lyman a forest community should reach a consensus on the interpretation of these 
observations and on the level of systematic contamination 



PL{k) = P''^{k)QnQTQ 



T ) 



(43) 




(44) 
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the statistical errors, GH quote a systematic uncertainty that we add to the statistical one. Finally, 

the uncertainties in Qq, Qt and contribute to the overall normalization uncertainty. We use the 
Croft et al. (2002) prescription to parameterize this uncertainty as liiV{A) = —1/2[A — l)^/c"|,y^ 
where if A < 1 then aLy„ = 0.25 while if yl > 1, aiy^ = 0.29. 

N-body simulations are used to convert the flux power spectrum to the dark matter power 
spectrum and calibrate the form of equation (43). The two different groups, GH and Croft et al. 
(2002) , have done this independently. The resulting power spectra agree well within the 1-cr errors 
for all data points except the last three. We thus increase the la uncertainties on the last three 
data points to make the two determinations of Piik) consistent and use this as a rough measure 
of the intrinsic systematic uncertainties in the Lyman a data. 

GH point out that the correlation in flux measured from the Lyman a forest samples power 
over a finite band of wavenumbers. The effective band-power windows are rather broad due to 
the peculiar velocities that smear power on scales comparable to the 1-d velocity dispersion. Thus 
the recovered linear power spectrum is effectively smoothed with an window that becomes broader 
at smaller scales. In principle, the resulting covariance between estimates of power at different k 
needs to be taken into account to do a full likelihood analysis to extract cosmological parameters. 
However, the full covariance matrix is not available. Since the Lyman a data are such a powerful 
tool we just perform a simple fit and caution the reader that interpreting the reduced a-s a 
measure of goodness of fit for this data set is not meaningful since the data are strongly correlated. 

To marginalize over the overall normalization uncertainty, we take advantage of the MCMC 
approach. In principle we could marginalize over it analytically, as we do for the calibration 
uncertainty. Instead, at each step of the chain we compute the best fit amplitude A as done for 
point sources (Hinshaw et al. 2003b), 



A 



Y^iP°''%kf/al) 



(45) 



The likelihood for the Lyman a data for the model is given by InCiy^ = lrLC{P"^^\A, PL)+ln'P{A). 
The marginalization is then automatically obtained from the MCMC output. In other words, the 
analytic marginalization computes J 7'(data|model)7'(A)d^ while we compute an estimator of this 
given by / P(data|model)7^(A)dA. 



7. Conclusions 

In this paper, we have presented the basic formalism that we use for our likelihood analysis. 
This paper shows the final step on the path from time-ordered data to cosmological parameters. 
It provides the framework for the analysis of cosmological parameters and their implications for 
cosmology. 
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The unprecedented quality of the WMAP data and the tight constraints on cosmological 
parameters that arc derived require a rigorous analysis so that the approximations made in the 
modeling do not propagate into significant biases and systematic errors. Wc have derived an 
approximation to the exact likelihood function for the Cg which is accurate to better than 0.1%, 
and we have carefully calibrated the temperature power spectrum covariance matrix with Monte 
Carlo simulations. This enables us to use the effective chi-squared per degree of freedom as a tool 
to test whether or not a model is an acceptable fit to the data. 

We implement our likelihood analysis with the MCMC. We have concentrated on the issue of 
convergence and mixing, emphasizing how important these issues are in recovering cosmological 
parameters values and their confidence levels from the MCMC output. 

To the WMAP data-sets (TT and TE angular power spectra) we have added the CBI and 
ACBAR measurement of the CMB on smaller angular scales, the 2dFGRS galaxy power spectrum 
at z ~ 0, and the Lyman a forest matter power spectrum at z ~ 3. These external data sets 
significantly enhance the scientific value of the WMAP measurement, by allowing us to break 
parameter degeneracies. While the underlying physics for these data sets is much more complicated 
and less well understood than for WMAP data, and systematic and instrumental effects are much 
more important, we feel we have made a significant step towards improving the rigor of the analysis 
of these data sets. We have included a detailed modeling of galaxy bias, redshift distortions and 
the non-linear growth of structure. We also include known (as to the present day) systematic and 
statistical uncertainties intrinsic to these other data sets. 
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Fig. 1. — Ratio of the effective sky coverage to the actual sky coverage. This correction factor 
cahbrates the expression for the Fisher matrix to the value obtained from the Monte Carlo approach. 
Here we show the ratio obtained from 100,000 simulations (jagged line), the smooth curve shows 
the fit we use, equation (16). Note that, since we are switching between weighting schemes, the 
correction factors are not expected to smoothly interpolate between regimes. 
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Fig. 2. — Correction factor for the noise. The lines are as in Figure l.Note that, since we are 
switching between weighting schemes, the correction factors are not expected to smoothly interpo- 
late between regime 
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Fig. 3. — Unconverged Markov chains. The left panel shown a trace plot of the likelihood values 
versus iteration number for one MCMC (these are the first 3000 steps from one of our ACDM 
model runs). Note the burn-in for the first ~ 100 steps. In the right panel, red dots are points of 
the chain in the (n, A) plane after discarding the burn-in. Green dots are from another MCMC 
for the same data-set and the same model. It is clear that, although the trace plot may appear to 
indicate that the chain has converged, it has not fully explored the likelihood surface. Using either 
of these two chains at this stage will give incorrect results for the best fit cosmological parameters 
and their errors. 
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Fig. 4. — The CMB angular power spectrum (in //K^) for our best fit ACDM model for £ > 800 and 
the Sunayev-Zel'dovich contribution for ag = 0.98 for CBI wavelengths (dotted) and for ACBAR 
(dashed). The vertical line shows the adopted cutoff for CBI and ACBAR. 
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Fig. 5. — The combined CMB and LSS data set. The CMB angular power spectrum in /iK^ (top 
panel) as a function of k where k is related to ^ by £ = T]Qk (where 7]q ~ 14400 Mpc is the distance 
to the last scattering surface). Black points are the WMAP data, red points CBI, blue points 
ACBAR. The LSS data (bottom panel). Black points are the 2dFGRS measurements and green 
points are the Lyman a measurements. Both LSS power spectra are in units of (Mpc h~^)^ and 
have been rescaled to z = 0. This plot only illustrates the scale coverage of all the data sets we 
consider. The various LSS power spectra as plotted here cannot be directly compared with the 
theory because of the effects outlined in §5 (e.g., redshift-space distortions, non-linearities, bias and 
window function effect etc.). 
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Fig. 6. — The matter power spectrum in (Mpc h"^)"^), linear in real space (solid line), non-linear in 
real space (dashed line) and non-linear in redshift space (dotted line). The error bars on the dotted 
line show the size of the statistical error-bars per A;-bin of the 2dFGRS galaxy power spectrum. 
The power spectrum is in units of (Mpc h)^. 
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Fig. 7.— Two cosmological models: = 0.0235, w„ = 0.143, = 0.978 r = 0.11, w; = -0.426, h = 0.53 

(solid line) LUb = 0.0254, lu^ = 0.137, n,, = 1.024, t = 0.08 w = -1, h = 0.77 (dotted line). The two models 
are indistinguishable within current error-bars from the CMB angular power spectrum (left panel, units for 
the power spectrum are /xK^). However they can easily be distinguished if we can relate the observed power 
spectrum to the underlying matter power spectrum (right panel, units for the power spectrum are (Mpc 
h~^)^). The error bars on the solid line are examples of the size of the 2dFGRS and Lyman a power spectra 
statistical error-bars for one data point at different scales. There are 4 error bars for 2dFGRS and 4 for 
Lyman a . 




Fig. 8.— Two cosmological models: fi^ = 0.26, ujb = 0.02319, r = 0.12, = 0.953, = 0, /i = 0.714 
(sohd line) and Q.m = 0.26, uJb = 0.02319, r = 0.12, = 0.953, = 0.02, h = 0.6 (dashed line). As before 
the two models are virtually indistinguishable from the CMB angular power spectrum (left panel, units for 
the power spectrum are /xK^), but they can easily be distinguished if the matter power spectrum amplitude 
is known (right panel , units for the power spectrum are (Mpc h~^)^). The error bars on the solid line are 
examples of the size of the 2dFGRS and Lyman a power spectra statistical error-bars for one data point. 
There are 4 error bars for 2dFGRS and 4 for Lyman a . 



